PDSORT
A Public Domain External Sort Program
Author: Don A. Williams
Version: 3.2.1
Date: September 7, 1991
**************************** NOTICE! **************************
* Contrary to the current trend in MS-DOS software this *
* program, for whatever it is worth, is NOT copyrighted *
* (with the exception of the runtime library from Borland *
* International's Turbo C)! The program, in whole or in *
* part, may be used freely in any fashion or environment *
* desired. If you find this program to be useful to you, *
* do NOT send any contribution to the author; in the words *
* of Rick Conn, 'Enjoy!' However, if you make any *
* improvements, I would enjoy receiving a copy of the *
* modified source. I can be reached, usually within 24 *
* hours, by messages on any of the Phoenix systems, *
* particularly: *
* *
* Technoids Anonymous [PCBOARD] *
* (602) 899-4876 300/1200/2400 bps *
* *
* All can be reached through PC Pursuit. *
* *
* *
* Every effort has been made to avoid error and moderately *
* extensive testing has been performed on this program, *
* however, the author does not warrant it to be fit for any *
* purpose or to be free from error and disclaims any *
* liability for actual or any other damage arising from the *
* use of this program. *
* *
* Don A. Williams *
* 3913 W. Solano Dr. N. *
* Phoenix, AZ 85019 *
* (602) 841-5333 *
*****************************************************************
PDSORT is a fully public domain external sort program, i.e. it can
sort files that are too big to be wholly contained in memory. This
version of PDSORT contains a fully public domain implementation of an
iterative version of qsort(), PDQSORT.C, that is fully compatible with
the ANSI C standard run time qsort(). The qsort() routine supplied
with most C compilers is a recursive implementation that is slower
and requires a great deal more stack than the iterative version. The
length of the file that can be sorted by PDSORT is limited only by the
available disk space. However, you must have at least twice as much
free space as the length of the file to be sorted: PDSORT uses an
intermediate file the size of the input file, and the sorted output
file will be the same size as the input file.
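The run-and-merge idea behind an external sort can be sketched as follows. This is only an illustration in Python, not PDSORT's actual code: the names iterative_quicksort, external_sort and run_size are hypothetical, and the sketch writes one temporary file per run where PDSORT is described as using a single intermediate file.

```python
import heapq
import itertools
import os
import tempfile


def iterative_quicksort(items):
    """Sort a list in place using an explicit stack instead of recursion."""
    stack = [(0, len(items) - 1)]
    while stack:
        lo, hi = stack.pop()
        while lo < hi:
            pivot = items[(lo + hi) // 2]
            i, j = lo, hi
            while i <= j:
                while items[i] < pivot:
                    i += 1
                while items[j] > pivot:
                    j -= 1
                if i <= j:
                    items[i], items[j] = items[j], items[i]
                    i += 1
                    j -= 1
            # Defer the larger partition and continue with the smaller
            # one, which keeps the explicit stack short.
            if j - lo < hi - i:
                stack.append((i, hi))
                hi = j
            else:
                stack.append((lo, j))
                lo = i


def external_sort(in_path, out_path, run_size=1000):
    """Sort a text file by whole lines using sorted runs and one merge pass."""
    run_paths = []
    tmp_dir = os.path.dirname(os.path.abspath(out_path))
    with open(in_path) as src:
        while True:
            # Read at most run_size records; make sure each ends in "\n".
            run = [ln if ln.endswith("\n") else ln + "\n"
                   for ln in itertools.islice(src, run_size)]
            if not run:
                break
            iterative_quicksort(run)
            fd, path = tempfile.mkstemp(dir=tmp_dir, text=True)
            with os.fdopen(fd, "w") as tmp:
                tmp.writelines(run)
            run_paths.append(path)
    runs = [open(p) for p in run_paths]
    try:
        with open(out_path, "w") as dst:
            # Single merge pass over all runs at once.
            dst.writelines(heapq.merge(*runs))
    finally:
        for f in runs:
            f.close()
        for p in run_paths:
            os.remove(p)
```

While the merge is running, the runs and the growing output file exist on disk together, which is the reason for the "twice the file size" free space requirement.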
USAGE:
There are three forms for executing PDSORT:
pdsort in_file out_file [option] [max_rec_length] [key_spec...]
and:
pdsort - [option] [max_rec_length] [key_spec...] <in_file >out_file
and:
pdsort <in_file >out_file
In the first form, the in_file specification can be any standard
MS-DOS file specification, including full path specifications, but may
NOT contain "wild cards". The out_file specification may also be
any standard MS-DOS file specification not containing "wild cards".
The in_file and out_file names may be the same but, in this case,
PDSORT will destroy the input file by overwriting it with the output
file. If no max_rec_length argument is specified, PDSORT will use a
default of 256 characters/record. No record in the input file may
exceed the maximum record length. If PDSORT detects a record that
exceeds the maximum record length, it will issue a message identifying
the record and will terminate.
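The maximum record length check described above might look like the following sketch. The names read_records and RecordTooLongError are hypothetical, and it is an assumption that the line terminator does not count toward the record length.

```python
DEFAULT_MAX_RECORD_LENGTH = 256  # default stated in the text above


class RecordTooLongError(ValueError):
    """Raised when an input record exceeds the maximum record length."""


def read_records(lines, max_length=DEFAULT_MAX_RECORD_LENGTH):
    """Yield records without line terminators, enforcing max_length."""
    for number, line in enumerate(lines, start=1):
        # Assumption: CR/LF terminators are not part of the record.
        record = line.rstrip("\r\n")
        if len(record) > max_length:
            # Identify the offending record, then stop reading.
            raise RecordTooLongError(
                f"record {number} is {len(record)} characters long; "
                f"the maximum is {max_length}"
            )
        yield record
```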
In the second form, PDSORT executes as a "filter", reading and sorting
standard input to produce standard output, both of which may be
redirected.
In the third form, PDSORT also executes as a "filter", reading and
sorting standard input to produce standard output, both of which may
be redirected. The difference between the second form and the third
form is that the third form uses the default maximum record size of
256 characters/record and the default sort key of the entire record.
Options:
The only options supported by this version of PDSORT are:
- A single '-', delimited by blanks, instructs PDSORT to
operate as a filter, taking its input from standard input
and sending its output to standard output.
-tpath The '-t' option allows the user to specify the path for the
intermediate file created by PDSORT. If no such path is
specified, PDSORT will use the path of the output file as
the path of the intermediate file. If the "filter" option
is selected and no intermediate file path is specified, the
intermediate file will be created in the current directory.
There must be NO blanks between the "-t" and the path
specification.
-o The '-o' option will suppress all messages except error
messages.
-k Use a "key sort" instead of sorting the entire input file.
If this option is specified, PDSORT will extract the key
fields from the input file for sorting. This option can
alleviate the problem caused by the fact that PDSORT
currently has only a single merge pass. Large files of
long records can produce too many runs to be merged in a
single pass, particularly in reduced memory environments
such as executing PDSORT from within another program. The
"key sort" is NOT the complete answer to this problem, and
a future version of PDSORT will support a multi-pass merge.
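The idea of a "key sort" can be sketched as follows: only the key field and the file position of each record are sorted, and the full records are copied afterwards in key order. This is an illustration under stated assumptions, not PDSORT's implementation. The name key_sort is hypothetical, the sketch handles a single key given as inclusive columns counted from 1, and it holds all keys in memory at once.

```python
def key_sort(in_path, out_path, begin, end):
    """Sort a file on columns begin..end by sorting (key, offset) pairs."""
    entries = []
    with open(in_path, "rb") as src:
        while True:
            offset = src.tell()
            line = src.readline()
            if not line:
                break
            record = line.rstrip(b"\r\n")
            # Keep only the key and where the record starts in the file.
            entries.append((record[begin - 1:end], offset))
        # Equal keys fall back to the offset, i.e. input order.
        entries.sort()
        with open(out_path, "wb") as dst:
            for _, offset in entries:
                # Fetch each full record from the input in sorted order.
                src.seek(offset)
                dst.write(src.readline().rstrip(b"\r\n") + b"\n")
```

Because the keys are usually much shorter than the records, far more of them fit in memory per run, which is why this approach produces fewer runs to merge.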
Key Specification:
There may be as many keys specified as you wish. The file will be
sorted on the keys in the order in which they are specified. Each
key_spec has one of the following two formats:
b:l[:field_options]
or:
b-e[:field_options]
where:
b - Specifies the beginning character position of the field in
decimal; e.g. to sort a field that is in columns 10 through 17
of the record, b would be 10.
l - Specifies the length of the field in characters; e.g. to sort a
field that is in columns 10 through 17 of the record, l would
be 8: 10:8[:options].
e - Specifies the ending character position of the field in
decimal, inclusive; e.g. to sort a field that is in columns
10 through 17 of the record, e would be 17: 10-17[:options].
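The two key_spec formats describe the same information, as the following sketch of a parser shows. KeySpec and parse_key_spec are hypothetical names, and rejecting positions below 1 assumes that columns are counted from 1.

```python
from typing import NamedTuple


class KeySpec(NamedTuple):
    begin: int      # first column of the field
    length: int     # number of characters in the field
    options: str    # field option letters, possibly empty


def parse_key_spec(spec):
    """Parse 'b:l[:options]' or 'b-e[:options]' into a KeySpec."""
    parts = spec.split(":")
    if "-" in parts[0]:
        # b-e form: the ending column is inclusive.
        begin_text, end_text = parts[0].split("-", 1)
        begin = int(begin_text)
        length = int(end_text) - begin + 1
        options = parts[1] if len(parts) > 1 else ""
    else:
        # b:l form: the second part is the field length.
        if len(parts) < 2:
            raise ValueError(f"missing length in key spec {spec!r}")
        begin = int(parts[0])
        length = int(parts[1])
        options = parts[2] if len(parts) > 2 else ""
    if begin < 1 or length < 1:
        raise ValueError(f"invalid key spec {spec!r}")
    return KeySpec(begin, length, options)
```

With this sketch, parse_key_spec("10:8") and parse_key_spec("10-17") both describe a field of 8 characters starting in column 10.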
Field Options:
a - Specifies that the sort on this field is to be in ascending
order, the default if no field option is specified for this
field.
d - Specifies that the sort on this field is to be in descending
order.
c - Specifies that the sort on this field is to be case sensitive;
i.e. the word "Abscess" would sort lower than the word
"abscess". A case sensitive sort is the default if none
is specified.
i - Specifies that the sort on this field is to be case
insensitive.
c - Specifies that this field is ASCII character data, the default
if not specified. Since PDSORT currently supports only ASCII
character fields, this option is only for upward compatibility
with future versions of PDSORT that may support other field
types such as integer (numeric).
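How several keys with different field options combine can be sketched as below. This is not PDSORT's comparison routine: sort_records is a hypothetical name, keys are given as (begin, end, options) tuples with inclusive columns counted from 1, and only the 'd' and 'i' options are modelled, with ascending and case sensitive as the defaults.

```python
def sort_records(records, keys):
    """Return records sorted on keys given as (begin, end, options).

    options is a string of field option letters: 'd' selects descending
    order and 'i' selects a case insensitive comparison.
    """
    result = list(records)
    # Sort on the least significant key first. Python's sort is stable,
    # so each later pass keeps the order set by the earlier ones among
    # records whose current key compares equal.
    for begin, end, options in reversed(keys):
        fold_case = "i" in options

        def field(record, b=begin, e=end, fold=fold_case):
            value = record[b - 1:e]
            return value.lower() if fold else value

        result.sort(key=field, reverse="d" in options)
    return result


records = ["b 2", "a 2", "C 1", "c 3"]
# Column 3 descending first, then column 1 ignoring case.
ordered = sort_records(records, [(3, 3, "d"), (1, 1, "i")])
```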
Error Levels
If PDSORT terminates due to an error condition, it will set the MS-DOS
ERRORLEVEL as follows:
1 - Invalid command argument or option.
2 - Unable to open the specified input file. Possibly incorrect
file name.
3 - Input/Output error on input file.
4 - A record in the input file exceeds the maximum record length.
5 - Insufficient space on output disk.
6 - Insufficient space on intermediate disk.
7 - Unable to create output file.
8 - Input/Output error on output file.
9 - Unable to create intermediate file.
10 - Input/Output error on intermediate file.
11 - Reopen failure!
12 - Insufficient memory.
On normal termination, the ERRORLEVEL will be set to 0.
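A program that runs PDSORT and inspects its exit status could translate the ERRORLEVEL into a message with a small table such as the one below. The table simply restates the list above; PDSORT_ERRORLEVELS and describe_errorlevel are hypothetical names.

```python
# ERRORLEVEL values as listed in the text above.
PDSORT_ERRORLEVELS = {
    0: "Normal termination.",
    1: "Invalid command argument or option.",
    2: "Unable to open the specified input file.",
    3: "Input/Output error on input file.",
    4: "A record in the input file exceeds the maximum length.",
    5: "Insufficient space on output disk.",
    6: "Insufficient space on intermediate disk.",
    7: "Unable to create output file.",
    8: "Input/Output error on output file.",
    9: "Unable to create intermediate file.",
    10: "Input/Output error on intermediate file.",
    11: "Reopen failure!",
    12: "Insufficient memory.",
}


def describe_errorlevel(code):
    """Return the documented meaning of a PDSORT ERRORLEVEL value."""
    return PDSORT_ERRORLEVELS.get(code, f"Unknown ERRORLEVEL {code}.")
```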
Examples:
Assume a file, named FILELIST, that contains a list of file names,
sizes, date/times, and paths, such as can be created by NUFIND:
a----w 27,974 90-04-30 6:30 c:\acelst.430
a----w 28,196 90-05-21 5:05 c:\acelst.521
a----w 28,238 90-05-25 5:41 c:\acelst.525
a----w 26,705 90-05-25 5:39 c:\acelst.lst
a----w 128,537 90-05-25 5:41 c:\acelst.zip
a----w 904 90-05-23 4:33 c:\autoexec.bat
a----w 35,840 89-06-30 12:16 c:\command.com
a----w 46 89-08-20 3:53 c:\config.cal
a----w 284 90-05-21 19:52 c:\config.sys
a----w 2,128 90-05-23 4:32 c:\configur.dat
------ 39,385 89-07-14 12:00 c:\drbdos.sys
------ 18,304 89-06-30 12:16 c:\drbios.sys
---sh- 4,096 90-03-18 13:22 c:\drildr.sys
a----w 167 89-10-11 7:56 c:\dsas.cmd
a----w 102 86-11-04 9:14 c:\ed.def
a----w 50,326 86-10-02 21:34 c:\ed.hlp
a----w 3,362 86-06-09 13:10 c:\fakey.com
a----w 0 90-05-23 7:54 c:\ftf.dat
a----w 12,275 85-06-16 18:12 c:\helpe.def
a----w 79 89-05-17 5:02 c:\indent.pro
a----w 7,122 90-01-15 1:00 c:\lineend.com
a----w 1,060 90-05-25 9:40 c:\mark0
a--sh- 41 90-05-25 9:40 c:\mirorsav.fil
a----- 41,472 90-05-25 9:40 c:\mirror.bak
a----- 41,472 90-05-25 9:40 c:\mirror.fil
a----w 237 90-02-06 5:09 c:\model
a----w 12,432 87-03-10 13:34 c:\mouse.sys
a----w 2,507 87-10-21 13:45 c:\nansi.sys
a----w 251 87-07-16 18:06 c:\newkbios.com
a----w 6,094 89-11-21 15:03 c:\no101.com
a----w 2,836 90-05-03 7:21 c:\phone.dir
a----w 2,670 89-12-01 17:00 c:\phones
a----w 2,010 90-03-18 11:14 c:\pushdir.stk
a----w 90 89-09-05 3:29 c:\ruler.def
a----w 1,465 87-04-22 10:38 c:\ruler.prt
a----w 53,632 85-05-03 14:09 c:\sk.hlp
a----w 33,611 86-12-05 9:21 c:\skn.com
a----w 18,825 90-01-15 1:00 c:\synonym.com
a----w 1,610 90-05-13 16:15 c:\synonym.def
a----w 14,426 90-05-19 17:38 c:\utils
a----w 1,060 90-05-25 12:17 c:\mark1
The file path name begins in column 40 and extends through 80, the
file size is in columns 15 through 23, inclusive, the file date is in
columns 25 through 32, inclusive, and the file time is in columns 34
through 39, inclusive.
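The column layout just described can be illustrated with a short sketch that builds one fixed-width record and slices the fields back out of it. The column ranges come from the text above. Everything else is an assumption made for illustration: the helper names, the width of the attribute field, and the right-justification of the size and time fields.

```python
# Column ranges (counted from 1, inclusive) as stated in the text above.
FIELDS = {
    "size": (15, 23),
    "date": (25, 32),
    "time": (34, 39),
    "path": (40, 80),
}


def make_record(attrs, size, date, time, path):
    """Build a hypothetical fixed-width record matching FIELDS."""
    # attrs fills columns 1-14, size 15-23, date 25-32, time 34-39.
    return f"{attrs:<14}{size:>9} {date:<8} {time:>6}{path}"


def extract(record, name):
    """Return the named field from a record."""
    begin, end = FIELDS[name]
    # Column numbers start at 1, Python string indices at 0.
    return record[begin - 1:end]


record = make_record("a----w", "128,537", "90-05-25", "5:41",
                     "c:\\acelst.zip")
size = extract(record, "size")
path = extract(record, "path")
```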
To sort this list so that the largest files are first and files of
equal size are in name order, use the following command:
pdsort filelist fileout 80 15-23:d 40-80
giving the following list in the file FILEOUT:
a----w 128,537 90-05-25 5:41 c:\acelst.zip
a----w 53,632 85-05-03 14:09 c:\sk.hlp
a----w 50,326 86-10-02 21:34 c:\ed.hlp
a----- 41,472 90-05-25 9:40 c:\mirror.bak
a----- 41,472 90-05-25 9:40 c:\mirror.fil
------ 39,385 89-07-14 12:00 c:\drbdos.sys
a----w 35,840 89-06-30 12:16 c:\command.com
a----w 33,611 86-12-05 9:21 c:\skn.com
a----w 28,238 90-05-25 5:41 c:\acelst.525
a----w 28,196 90-05-21 5:05 c:\acelst.521
a----w 27,974 90-04-30 6:30 c:\acelst.430
a----w 26,705 90-05-25 5:39 c:\acelst.lst
a----w 18,825 90-01-15 1:00 c:\synonym.com
------ 18,304 89-06-30 12:16 c:\drbios.sys
a----w 14,426 90-05-19 17:38 c:\utils
a----w 12,432 87-03-10 13:34 c:\mouse.sys
a----w 12,275 85-06-16 18:12 c:\helpe.def
a----w 7,122 90-01-15 1:00 c:\lineend.com
a----w 6,094 89-11-21 15:03 c:\no101.com
---sh- 4,096 90-03-18 13:22 c:\drildr.sys
a----w 3,362 86-06-09 13:10 c:\fakey.com
a----w 2,836 90-05-03 7:21 c:\phone.dir
a----w 2,670 89-12-01 17:00 c:\phones
a----w 2,507 87-10-21 13:45 c:\nansi.sys
a----w 2,128 90-05-23 4:32 c:\configur.dat
a----w 2,010 90-03-18 11:14 c:\pushdir.stk
a----w 1,610 90-05-13 16:15 c:\synonym.def
a----w 1,465 87-04-22 10:38 c:\ruler.prt
a----w 1,060 90-05-25 9:40 c:\mark0
a----w 1,060 90-05-25 12:17 c:\mark1
a----w 904 90-05-23 4:33 c:\autoexec.bat
a----w 284 90-05-21 19:52 c:\config.sys
a----w 251 87-07-16 18:06 c:\newkbios.com
a----w 237 90-02-06 5:09 c:\model
a----w 167 89-10-11 7:56 c:\dsas.cmd
a----w 102 86-11-04 9:14 c:\ed.def
a----w 90 89-09-05 3:29 c:\ruler.def
a----w 79 89-05-17 5:02 c:\indent.pro
a----w 46 89-08-20 3:53 c:\config.cal
a--sh- 41 90-05-25 9:40 c:\mirorsav.fil
a----w 0 90-05-23 7:54 c:\ftf.dat
The same sort could be accomplished by the command:
pdsort - 80 15-23:d 40-80 <filelist >fileout
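Sorting the size field as character data gives the numeric order shown above only if the sizes are right-justified and padded with blanks within columns 15 through 23, which is assumed here. The sketch below compares the two cases. The name size_field and the sample values are for illustration only.

```python
def size_field(text, width=9):
    """Right-justify a size string in a fixed-width field."""
    return text.rjust(width)


sizes = ["904", "128,537", "1,060", "53,632", "0"]

# Padded to a fixed width, a descending character sort matches the
# numeric order because longer numbers start further to the left.
padded = sorted((size_field(s) for s in sizes), reverse=True)

# Without padding, the comparison starts at the first digit of each
# number, so "904" sorts ahead of "128,537".
unpadded = sorted(sizes, reverse=True)
```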
Now suppose that you wanted the most recently modified or created files
first. The following command would do that:
pdsort filelist fileout 80 25-32:d 34-39:d 40-80
giving the following list in FILEOUT:
a----w 1,060 90-05-25 12:17 c:\mark1
a----w 1,060 90-05-25 9:40 c:\mark0
a--sh- 41 90-05-25 9:40 c:\mirorsav.fil
a----- 41,472 90-05-25 9:40 c:\mirror.bak
a----- 41,472 90-05-25 9:40 c:\mirror.fil
a----w 28,238 90-05-25 5:41 c:\acelst.525
a----w 128,537 90-05-25 5:41 c:\acelst.zip
a----w 26,705 90-05-25 5:39 c:\acelst.lst
a----w 0 90-05-23 7:54 c:\ftf.dat
a----w 904 90-05-23 4:33 c:\autoexec.bat
a----w 2,128 90-05-23 4:32 c:\configur.dat
a----w 284 90-05-21 19:52 c:\config.sys
a----w 28,196 90-05-21 5:05 c:\acelst.521
a----w 14,426 90-05-19 17:38 c:\utils
a----w 1,610 90-05-13 16:15 c:\synonym.def
a----w 2,836 90-05-03 7:21 c:\phone.dir
a----w 27,974 90-04-30 6:30 c:\acelst.430
---sh- 4,096 90-03-18 13:22 c:\drildr.sys
a----w 2,010 90-03-18 11:14 c:\pushdir.stk
a----w 237 90-02-06 5:09 c:\model
a----w 7,122 90-01-15 1:00 c:\lineend.com
a----w 18,825 90-01-15 1:00 c:\synonym.com
a----w 2,670 89-12-01 17:00 c:\phones
a----w 6,094 89-11-21 15:03 c:\no101.com
a----w 167 89-10-11 7:56 c:\dsas.cmd
a----w 90 89-09-05 3:29 c:\ruler.def
a----w 46 89-08-20 3:53 c:\config.cal
------ 39,385 89-07-14 12:00 c:\drbdos.sys
a----w 35,840 89-06-30 12:16 c:\command.com
------ 18,304 89-06-30 12:16 c:\drbios.sys
a----w 79 89-05-17 5:02 c:\indent.pro
a----w 2,507 87-10-21 13:45 c:\nansi.sys
a----w 251 87-07-16 18:06 c:\newkbios.com
a----w 1,465 87-04-22 10:38 c:\ruler.prt
a----w 12,432 87-03-10 13:34 c:\mouse.sys
a----w 33,611 86-12-05 9:21 c:\skn.com
a----w 102 86-11-04 9:14 c:\ed.def
a----w 50,326 86-10-02 21:34 c:\ed.hlp
a----w 3,362 86-06-09 13:10 c:\fakey.com
a----w 12,275 85-06-16 18:12 c:\helpe.def
a----w 53,632 85-05-03 14:09 c:\sk.hlp
Again, the same sort could have been accomplished by the following
command:
pdsort - 25-32:d 34-39:d 40-80 <filelist >fileout